The Power of Sampling and Stacking for the PAKDD-2007 Cross-Selling Problem

Authors

  • Paulo J. L. Adeodato
  • Germano C. Vasconcelos
  • Adrian L. Arnaud
  • Rodrigo C. L. V. Cunha
  • Domingos S. M. P. Monteiro
  • Rosalvo F. Oliveira Neto
Abstract

This article presents an efficient solution to the PAKDD-2007 Competition cross-selling problem. The solution rests on a thorough approach that involves creating new input variables, efficient data preparation and transformation, an adequate data sampling strategy, and a combination of two of the most robust modeling techniques. Because the target class contains very few examples, model robustness was pursued in two ways: taking the median score of 11 models developed with an adapted version of 11-fold cross-validation, and combining two robust techniques, the MLP neural network and the n-tuple classifier, via stacking. Despite the problem's complexity, performance on the prediction data set (unlabeled samples), measured through KS2 and ROC curves, proved very effective, and the solution finished as the first runner-up of the competition.
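The median-of-models scoring and the KS evaluation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not describe how the cross-validation was adapted or how the stacking was wired, so the code only shows the general idea of combining per-model scores by their median and measuring class separation. It assumes that "KS2" refers to the two-sample Kolmogorov-Smirnov statistic, and the function names `median_ensemble` and `ks_statistic` are hypothetical.

```python
from statistics import median


def median_ensemble(scores_per_model):
    """Combine scores from several models (one list per model, same example
    order) by taking the median score for each example."""
    return [median(column) for column in zip(*scores_per_model)]


def ks_statistic(scores, labels):
    """Two-sample KS statistic (assumed meaning of "KS2"): the largest gap
    between the empirical score CDFs of the positive and negative classes."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    pairs = sorted(zip(scores, labels))
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume all examples that share the same score before comparing CDFs.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


# Toy illustration with 3 models and 4 examples (the paper uses 11 models).
toy_scores = [
    [0.1, 0.4, 0.8, 0.9],
    [0.2, 0.3, 0.7, 0.6],
    [0.3, 0.5, 0.9, 0.7],
]
toy_labels = [0, 0, 1, 1]
combined = median_ensemble(toy_scores)
print(combined, ks_statistic(combined, toy_labels))
```

Taking the median rather than the mean makes the combined score less sensitive to a single model that was trained on an unlucky fold, which is one plausible reason to prefer it when the target class is very small.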


Similar resources

A Solution to the Cross-Selling Problem of PAKDD-2007

Our team won the Grand Champion title (tie) in the PAKDD-2007 data mining competition. The data mining task is to score credit card customers of a consumer finance company according to the likelihood that they take up the home loans offered by the company. This report presents our solution for this business problem. TreeNet and logistic regression are the data mining algorithms used in this proj...


Ranking Potential Customers Based on Group-Ensemble

Ranking potential customers has become an effective tool for company decision makers to design marketing strategies. The task of PAKDD competition 2007 is a cross-selling problem between credit card and home loan, which can also be treated as a ranking potential customers problem. This article proposes a 3-level ranking model, namely Group-Ensemble, to handle such kinds of problems. In our mode...


Selecting Salient Features and Samples Simultaneously to Enhance Cross-Selling Model Performance

The rapid growth in information science and technology has led to the generation of huge amounts of valuable data in many areas. In finance, for example, over the past five years many banks have experienced exceptional growth in services and have built up their own Group Data Warehouse. To reach faster, more effective decisions and provide better customer service, new technologies t...


An Analytical Approach to Determining the Optimal Quantity of Forward Contracts

Due to volatility of spot power prices and in order to manage risk of a Generating Company (GenCo), this paper addresses determination of optimal quantity of bilateral forward contracts which can be formulated as an optimization problem. In this framework, in addition to selling electricity to spot market, bilateral forward contracts can be traded between Gencos and customers. Finding an optima...


Economic Evaluation of Optimal Capacitor Placement in Reconfiguration Distribution System Using Genetic Algorithm

Optimal capacitor placement, considering power system loss reduction, voltage profile improvement, line reactive power decrease and power factor correction, is of particular importance in power system planning and control. The distribution system operator calculates the optimal place, number and capacity of capacitors based on two major purposes: active power loss reduction and return on invest...



Journal:
  • IJDWM

Volume 4

Publication date: 2008